New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Trac #768 - Add GEOSSTRtree_nearest to CAPI #61
Conversation
@strk how does one run the GEOS unit tests through valgrind? When I intentionally introduce a memory leak, it doesn't show up with |
I had no idea you could use "ctest" (I dont' use cmake).
|
Thanks, that did the trick. |
@strk any thoughts about this pull request? |
@dbaston Since there is no MemCheck config in the CMakeLists.txt, you would have to patch it using CTEST_MEMORYCHECK_* variables, then it would be possible to use CTest MemCheck. |
Could you add some documentation about the new CAPI entry ? Also, can we avoid the void pointers in the distance function ? |
Yes, some documentation is clearly needed. The idea with the void pointers (and the purpose of a user providing their own distance function) is because we don't know the type of the objects stored in the STRtree (PostGIS stores LWGEOMS in them, for example). This is similar to the use of void pointers in the user-provided callback to Looking at this again after quite a bit of time away, I can see that the way I exposed this flexibility in the CAPI is flawed. There's currently no way you could query a tree of LWGEOMS. I think the signature actually needs to be:
Is there a better way to do it? |
On Thu, Apr 14, 2016 at 06:10:32AM -0700, Dan Baston wrote:
Until I see the documentation I can't tell if there's a better way to Note that I didn't implement the GEOSSTRtree interface myself so that In your suggested signature, is the |
Yes,
|
Yes. It's similar to the JTS implementation, which stores |
On Thu, Apr 14, 2016 at 09:04:33AM -0700, Dan Baston wrote:
You meant "itemEnvelope" here, I guess.
Will the default GEOS geometry distance assume that the void pointer
Could the function be extended to return the N nearest items ? |
Yes
Let's say "must", but I think a better solution is to have two signatures (see below)
Yes. Again, resolved by having two signatures.
Presumably, but I've never seen it documented what that invariant is. I've never used anything other than a plan-old distance function
Unfortunately not, I think that change would have to happen at the JTS level. Since this is proving to be a bit cumbersome in the most likely case, where the tree is in fact storing GEOSGeometry objects, maybe there should be two functions:
|
An empty tree would then throw an exception ? Or the input geometry ?
I'd say that "itemEnvelop" is the object with which the tree is
|
Yes, an empty tree or input geometry should throw an exception and return NULL. I will add test cases for this.
I don't want to provide too much detail on how the function is executed; the important thing (from a caller's perspective) is that it computes the distance between objects of the type that is stored in the tree, and with which the tree is queried. This is just coming from the JTS interface
I will change the CAPI to have both |
On Fri, Apr 15, 2016 at 06:14:32AM -0700, Dan Baston wrote:
I don't want to provide too much detail on how the function is executed; the important thing (from a caller's perspective) is that it computes the distance between objects of the type that is stored in the tree, and with which the tree is queried.
In that case the function would need another parameter to hold a
context. I mean, while in Java/C++ you can have "functor", in C
you need an explicit parameter to know from where the distance
is called. There could be different reasons to have a context,
for example a pointer to a custom error reporting function
or memory allocator or whatever.
So, if if I know the first parameter to my provided callback
will always be the pointer I passed to the function, then
I can use that pointer to hold whatever context I want.
|
I'm not understanding the need to provide a context in this particular case, but I'm no expert here and very happy to take your word for it. That said, I don't want to thread a void pointer all the way through this algorithm, breaking similarity with JTS, for what's probably a fringe use case. How about we just forget about supporting a "nearest" operation on anything other than GEOSGeometry? That leaves us with simply
|
So I went looking at the GEOSSTRtree interface to see what we have
now. I think it is good to be consistent.
There's a query function that takes a callback and passes a "userdata"
void pointer to the callback:
typedef void (*GEOSQueryCallback)(void *item, void *userdata);
That callback type is used for both GEOSSTRtree_query
and GEOSSTRtree_iterate, wheras both functions take the userdata
pointer as an argument (and pass it to the callback verbatim).
Unfortunately there's no unit test for the GEOSSTRtree functions,
nor for the underlying STRtree C++ class. It would be useful to
start there to understand the interface better.
I think to understand that all GEOSSTRtree functions exposed at
the C-API level only deal with indexing objects by their envelopes,
leaving it to the caller to tell them apart (not trying to extract
an envelope from an item). And they return items whose reported
envelopes intersect the envelope used for querying.
Now the "nearest" function is adding handling an additional object,
which is the item itself. And probably assuming something about the
relation between item distance and item envelope distance.
I find in the code that items are grouped togheter in groups
"boundables" based on envelope proximity. Finding the "nearest"
object is probably done based on envelope distance first, avoiding
to look inside nodes whose envelope distance is bigger than a
candidate distance found, and updating the candidate distance to the
closest distance within the node to the query item.
So I guess the invariant is that no item distance can ever be
smaller than item envelope distance.
Sorry if this analisys is confusing, as I actually hoped for it to
help in further shaping the API. Given you are currently allowed to
insert opaque objects in the index (associated to a GEOSGeometry
envelope) I do find it consistent to also allow using an opaque
function to compute item distance (as long as it is documented).
Even more so than assuming the item is a geometry.
|
@strk thanks for the review and analysis. This matches my understanding of the API too. (I'm happy to suggest some docstring comments for the rest of the STRtree functions too, after this is wrapped up). Looking at this some more, it looks like we can pass
|
Given the insert function does not make any assumption about
the inserted object being a geometry, I don't see making
the assumption from the nearest-query function as consistent.
How about just keping the custom-distance one ?
Also please add the invariant expectation for the distance
function.
|
OK, I've modified the PR to:
The motivation for having a GEOSGeometry-specific version of the "nearest" function was that the generic version is a bit confusing....for a GEOSGeometry you'd have to call it as |
From the documentation of the generic function it doesn't look like you can pass NULL as the distance function, right ? So to use a geometry-distance function you should still pass in your own distance function. If you want to keep the geometry-specific function you should document very carefully that it will assume that the items put in the STRtree objects are in fact geometries. Please also add tests for passing empty geometries (this one would be also interesting to do for the tree insertion functions, like passing empty envelope). Any reason not to use a typedef for the callback function ? As it is used for other callbacks, it would sound more consistent. |
Added a note to the docstring.
Added.
I remember that I found the typedef of |
Getting closer, I built and tested and all looked fine, but
after squash-rebasing to svn-trunk and taking a final look I found
this:
```
+ try
+ {
+ if (distancefn) {
+ struct CustomItemDistance : public ItemDistance {
+ CustomItemDistance(GEOSDistanceCallback p_distancefn,
void* p_userdata)
+ : m_distancefn(p_distancefn),
m_userdata(p_userdata) {}
+
+
```
Is that not c++11 only, to define a structure in a conditional block ?
|
My (very limited) understanding is that it only requires C++11 if you want to use it as a template parameter. |
Indeed building with -std=c++98 seems ok.
I'll push as-is, then maybe move the definition outside the try block
|
Merged as r4184 = 63d2554 (refs/remotes/trunk) |
Excellent - thanks for working with me on this! |
Thank you !
Looking forward for your next PR, maybe on Gogs this time :)
|
Includes tests for the new API and pre-existing STRtree API Closes #768 Patch by Daniel Baston <dbaston@gmail.com> via #61 git-svn-id: http://svn.osgeo.org/geos/trunk@4184 5242fede-7e19-0410-aef8-94bd7d2200fb
The PR was incomplete, can you look at it @dbaston ? |
https://trac.osgeo.org/geos/ticket/768
This is a necessary precursor to implementing MinimumClearance in GEOS/PostGIS, but useful in its own right.